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Abstract 



In this article, we focus on the sampling of the configurational Gibbs-Boltzmann 
distribution, that is, the calculation of averages of functions of the position coordinates 
of a molecular TV-body system modelled at constant temperature. We show how a formal 
series expansion of the invariant measure of a Langevin dynamics numerical method can 
be obtained in a straightforward way using the Baker-Campbell-Hausdorff lemma. We 
then compare Langevin dynamics integrators in terms of their invariant distributions and 
demonstrate a superconvergence property (4th order accuracy where only 2nd order would 
be expected) of one method in the high friction limit; this method, moreover, can be 
reduced to a simple modification of the Euler-Maruyama method for Brownian dynamics 
involving a non-Mar kovian (coloured noise) random process. In the Brownian dynamics 
case, 2nd order accuracy of the invariant density is achieved. All methods considered are 
efficient for molecular applications (requiring one force evaluation per timestep) and of a 
qq [ simple form. In fully resolved (long run) molecular dynamics simulations, for our favoured 

CN ] method, we observe up to two orders of magnitude improvement in configurational 

sampling accuracy for given stepsize with no evident reduction in the size of the largest 
usable timestep compared to common alternative methods. 
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1 Introduction 

Let U : R w — > R be the potential energy function of a classical model for a molecular 
system. A fundamental challenge is to sample the configurational Gibbs-Boltzmann (canonical) 
distribution with density 

Pp(x) = Z- 1 exp(-(3U(x)), (1) 

where /3 _1 = k B T where k B is Boltzmann's constant, T is temperature, and Z is a normalization 
constant so that pp has unit integral over the entire configuration space. A wide variety of 
methods are available to calculate averages with respect to pp\ among these, some of the most 
popular are based on Brownian dynamics or Langevin dynamics (defined in the phase space 
of positions and momenta) [H EJ [3J HI |5]. (In this article we focus exclusively on molecular 
dynamics techniques; molecular models can also be sampled using Monte-Carlo methods, and, 
more generally, using hybrid algorithms which combine molecular dynamics with a Metropolis- 
Hastings test in order to correct averages. For a recent review of such schemes see [6].) Recall 
that Brownian dynamics (overdamped Langevin dynamics) is a system of Ito-type stochastic 
differential equations of the form 



dx = -M~ l VU{x)dt + v / 2/3- 1 M" 1/2 dH/, x(0) = x , (2) 

where dW(t) is the infinitesimal increment of a vector of stochastic Wiener processes W(t), and 
M is a positive (we assume here diagonal) mass matrixlj A simple and popular method for 
numerical solution of Eq. [2] is the Euler-Maruyama method 

x n+l =x n - hM~ l VU(x n ) + ^2k B ThM- 1 ' 2 R n , 

where R n is a vector of random variables with standard normal distribution. This produces a 
sequence of points xq, x\,x<i, . . ., which, following a certain relaxation period, are approximately 
distributed according to the canonical invariant distribution. Euler-Maruyama has the property 
that the time averages along discrete trajectories, in the limit of large time (under appropriate 
conditions on the potential U(x) and assuming no effects from floating point rounding error), 
have error proportional to h. One of the observations of this article is that the simple 
modification 

It T 1 A 

x n+l =x n - hM- l VU(x n ) + ^-^-M- x ' 2 {R n + R n+1 ) (3) 

provides a second order approximation of stationary averages. 



^n terms of sampling the Gibbs distribution, M is in fact arbitrary, but it may be useful to allow for 
coordinate scaling. 



We arrive at this scheme by considering the large friction limit in a particular numerical 
method for Langevin dynamics. Recall that Langevin dynamics is a stochastic-dynamical 
system involving both positions and momenta p of the form 

dx = M^pdt, dp = [-W(x) - 7p]dt + aM 1/2 dW, (4) 

where W = W(t) is again a vector of N independent Wiener processes, and 7 > is a free 
parameter, the friction coefficient. The methods that are in fact the primary focus of this paper 
are splitting integrators that decompose the stochastic vector field of Langevin dynamics into 
simpler vector fields which can be solved exactly. The composition method that results cannot 
be directly related to a stochastic differential equation, so the analogy with backward error 
analysis for deterministic problems [T6J [XT] is incomplete, but nonetheless the invariant measure 
associated to the numerical method can be derived, using the Baker- Campbell-Hausdorff (BCH) 
expansion, as an asymptotic series in two parameters: the stepsize and the reciprocal of the 
friction coefficient. The superconvergence result alluded to above is then obtained in the high 
friction limit. 

The Langevin stepsize must be understood to be proportional to the square root of the 
stepsize that appears in Eq. El so in Langevin dynamics an effective 4th order approximation is 
obtained, but only for the marginal configurational invariant distribution, Eq. [TJ Our approach 
also provides a simple method for comparative assessment of the invariant measure of a class 
of Langevin integrators. 

Molecular dynamics is a large family of modelling techniques which is widely used in 
different application areas and for different purposes [18J. This article is addressed specifically 
to the topic of calculating averages with respect to an invariant distribution, and will probably 
be of highest interest for applications in molecular sampling, in particular the calculation 
of averages with respect to the configurational density Eq. [TJ The high friction limit 
renders Langevin dynamics unsuitable for dynamical modelling (except as a method for 
generating starting configurations for dynamical exploration). It is worth noting that invariant 
measure computations arise frequently in applications other than molecular modelling, and the 
techniques described here would be of potential use in many of these. 

In large scale simulations, it should be understood that the statistical error (dependent 
on the number of samples used) is typically the dominant concern. Our approach focuses 
on the truncation error of the invariant distribution, thus the greatest benefit would be seen 
only when the statistical error is well controlled. Nonetheless we observe in our numerical 
experiments that the least biased scheme from the point of view of the error introduced in 
configurational sampling is also as efficient as the alternatives and the most robust with respect 



to variation of the parameters (stepsize, friction coefficient), thus there is effectively no price 
for the improvement in accuracy. Moreover, it would be possible to complement the methods 
proposed here by procedures such as importance sampling [7j to further reduce the statistical 
error. 

With regard to the approximation of canonical averages, methods have previously been 
constructed for Brownian dynamics with order > 1 and for Langevin dynamics with order > 2 
[21 H], but these require multiple evaluations of the force; for this reason they are not normally 
viewed as competitive alternatives for molecular sampling [5] . By contrast all of the methods 
described in this article use a single force evaluation at each timestep. 

The approach used here may be compared to other recent works on stochastic numerical 
methods, and, in particular [HI QUI CHI EE21 EU EE5] • Our technique differs from these in (i) 
the direct focus on the stationary configurational distribution, and (ii) the use of the BCH 
expansion. Other articles (see e.g. [5]) which address the invariant measure in Langevin-type 
stochastic differential equations do not use the backward error analysis (and do not find the 
super convergent scheme). 

The rest of this article is as follows. In Section 2 we review necessary background on 
stochastic differential equations for sampling from the Gibbs measure. Section 3 presents our 
expansion of the associated perturbed invariant measure and calculations involving the use of 
the Baker-Campbell-Hausdorff theorem. Section 4 describes the reduction of the methods in 
the case of overdamped Langevin Dynamics. Section 5 demonstrates the theory obtained using 
numerical experiments to verify the results. 

2 Background 

The Ornstein-Uhlenbeck (OU) stochastic differential equation is an Ito equation of the form 

du = — judt + crdW, 

where u is a random variable defined for each time t, 7 and a are positive parameters and 
dW represents the infinitesimal increment of a Wiener process. Langevin dynamics (Eq. HJ) 
combines the Ornstein-Uhlenbeck stochastic vector field with conservative dynamics. 

For stochastic differential equations, the density evolves according to an evolution equation 
(the Fokker-Planck or forward Kolmogorov equation) of the form 

dp r * 



where C* is a second order differential operator. In the case of Langevin dynamics, the relevant 
Kolmogorov operator is defined by its action on a function of the variables of the system by 

2 

£* LD <j) = -M~ l P ■ V x <f> + W(x) ■ V p + 7 V P ■ {p4>) + y A p </>, 

where A p is the mass- weighted Laplacian in the momenta: A p = X^ m «a?- By choosing 
cr = ^/IksT^ one easily checks that the Gibbs distribution with density pp = Z~ l exp(—(3H) is 
a steady state of the Kolmogorov equation, where Z is a normalising constant and H(x,p) = 
p T M~ l p/2 + £/(a?) represents the Hamiltonian energy function. For later reference, we define 
averages with respect to the Gibbs distribution by 

E(0(x,p))= (f)(x,p)p/s(x,p)dxdp. 

Assuming U is C°° it is possible to demonstrate that the operator ££ D is hypoelliptic by 
using Hormander's criterion [19] based on iterated commutators, and this implies that the 
Gibbs measure is the unique steady state (up to normalization). Even stronger is the result 
of [20] which demonstrates the existence of a spectral gap for the operator £ld- Many of 
the challenges related to obtaining formal analytical results for stochastic differential equations 
relate to singularities of the potential and/or the assumption of an unbounded solution domain. 
However, with periodic boundary conditions and strong repulsive potentials (e.g. Lennard- 
Jones potentials) we observe that configurations typically evolve in a bounded set and remain 
far from singular points (the radial distribution vanishes in a large interval around the origin). 
Indeed it is a simple calculation to demonstrate that for a Lennard- Jones system with potentials 
<f(r) = (cr/r) 1 2 — 2(a/r) 6 the expected number of samples required to observe r G (0, cr/2) at 
unit temperature would involve a simulation of duration far greater than the age of the universe, 
due to the steepness of the potential close to the origin. In our simulations of a small Lennard- 
Jones cluster in Section 5, we did not observe a separation of two atoms beneath 0.7a in all 
of the nearly 10 11 timesteps performed to gather statistics; a separation less than 0.5er could 
only be seen at very large stepsizes, when instabilities due to other components of model, e.g. 
harmonic bonds, would have anyway rendered a typical molecular simulation useless. Thus it is 
somewhat of a moot point whether we simply assume that configurations stay well away from 
singular points (domain restriction) or that the potential has been smoothly cut off to remove 
the singularity; in no case will we encounter the atomic collision singularity in a simulation of 
the type envisioned here. 

For the purpose of deriving practical methods we assume that (i) the positions are confined 
to a periodic simulation box Q = L X T x L y T x L Z T, where T = K/Z is the one- dimensional 
torus, and (ii) U is C°° on Q. These assumptions, which are realistic in most molecular dynamics 



applications for the reasons mentioned above, allow us to use recent results in hypocoercivity 
[2TJ [22J E3] to establish the regularity properties of £ld- Specifically, we have as a consequence 
of these articles, that 

• £l D / = nas a un iq ue solution in C°°(Q), i.e., the Gibbs density pp, 

• £ld nas a compact resolvent [211 [22], and the Gibbs state is therefore exponentially 
attracting. 

Note that, if, for some given g, there are two solutions of £ld/ = 9 then, using the linearity of 
the operator these may differ only in a constant scaling of the Gibbs density. 



2.1 Timestepping Methods. 

In a splitting method for a deterministic system z = f(z), one divides the vector field / into 
exactly solvable parts, i.e. f = fi + /2, which are treated sequentially within a timestep. 
An example of such a splitting method for the Hamiltonian system with energy H(x,p) = 
J2iPl/(^ rn i) + U(xi,X2, • • • , xn) is the "symplectic Euler" method defined by /i = J2i "^h Pi^x^ 
J2 = — Y^,i a^dpi- By dividing the vector field as / = ~/i + f% + |/i, solving each vector field 
in turn, we obtain the so-called position Verlet method, and by switching the roles of f\ and 
f2 we obtain the velocity Verlet method. Splitting methods like these are explicit and this 
feature is of particular importance in molecular dynamics, where the force calculation is the 
usual measure of per-timestep computational complexity. 

In a similar way, Langevin dynamics may be treated by splitting [9J, [12], [13]. For example, 
one may divide the Langevin system (Eq. HJ) into three parts 



X 

p. 


= 


M~ l p 



dt + 




-vu 


dt + 







-7pdt + aM 1 ' 2 dW 



(5) 



A B O 

and each of the three parts may be solved 'exactly'. In the case of the OU part (labelled here 
simply as O) we mean by this that we realize the stochastic process by the equivalent formula 



p{t) = e" 7 'p(0) + (J^(l - e- 2 '< t )f3- 1 M l/2 R{t), 



(6) 



where R(t) is a vector of uncorrelated independent standard normal random processes (white 
noise). One method based on the splitting in Eq. [S]is defined by the composition 



<bao = exp((<Jt/2)£ A ) exp(<5££ B ) exp((<Jt/2)£ A ) exp(StC ), 

where exp(St£f) represents the phase space propagator associated to the (deterministic or 
stochastic) vector field /, and we use St as the timestep in Langevin dynamics (later we use h 



for the timestep of an associated Brownian dynamics). The deterministic part is approximated 
by the position Verlet method. This is referred to as a Geometric Langevin Algorithm of order 
two (GLA-2) following [TJ]; an alternative is to use velocity Verlet for the Hamiltonian part 
(V'babo)- A simple generalization of GLA methods is obtained by interspersing integrators 
associated to parts of the Hamiltonian vector field with exact OU solves, thus we have 

<boba = exp((5t/2)C A ) exp((<Jt/2)£ B ) exp(5tC ) exp((5t/2)£ B ) exp((5i/2)£ A ), 

and V'baoab defined in an analogous way. Recent work in [21] similarly uses exact OU solves 
to give an integrator equivalent to ^obabo (i n our notation), though with a reparameterised 
timestep. The analysis technique we use can employed to study many such integrators, though 
for brevity we will limit this article to discussion only on a select few interesting cases. 

An alternative integrator termed the Stochastic Position Verlet (SPV) method [9j [12], relies 
on the splitting 



x 

P 



M~ x p 




d£ 







- VUdt - 7£>d£ + aM 1 / 2 dw 
SPV is not a (generalized) GLA- type method, although it, as each of the generalized GLA 
schemes, is quasisymplectic in the language of [10]. Likewise the commonly used method of 
Brunger, Brooks and Karplus (BBK) [1] is not of the (generalized) GLA family. Details of all 
methods examined are given in the Appendix. 

To slightly simplify the presentation that follows, we make the change of variables 
q —?• M _1 / 2 g, p — > M +1 / 2 p, with a corresponding adjustment of the potential; this is equivalent 
to assuming M = I. 

3 Expansion of the Invariant Measure 

We shall work here with formal series expansions, however we expect that our derivation 
could be rigorously founded using techniques found in [25] and [26]. Associated to any given 
splitting-based method for Langevin dynamics, we define the operator £* that characterises the 
propagation of density by an expansion of the form: 

£* = £* LD + 5t£* 1 + 5t 2 £* 2 + 0(5t 3 ), (7) 

For example, for the method labelled by the string ABAO, we have 

exp(5t£ ABAO ) = exp((5t/2)£* A ) exp(5t£* B ) exp((5t/2)£* A ) exp(5t£* Q ). 

The perturbation series may be found by successive applications of the BCH expansion [16] and 
linearity properties of the Kolmogorov operator. However, unlike in the deterministic case, the 
terms that appear in the series cannot be associated to modified vector fields or even SDEs [27] . 
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Note that, when iterated n + 1 times, the method ABAO produces a sequence of the form 



thus, with a minor coordinate transformation, the dynamics sample the same invariant density 
as BAOAB. Similarly BABO and OBAB are essentially the same method as ABOBA. For this 
reason we concentrate in the remainder of this article on ABOBA and BAOAB. For these two 
methods, the symmetry implies that the odd order terms in Eq. [7] vanish identically using the 
Jacobi identity in the BCH expansions. 

After deriving C* in this way, we seek the invariant distribution which satisfies C*p = 0. 
For the BAOAB and ABOBA methods, we make the ansatz that the invariant measure of the 
numerical method has the simple form 

p oc exp{-fi[H + St 2 f 2 + St 4 f 4 + . . .]). (8) 

Although some technical issues might be encountered, we believe that the existence of such an 
expansion can be made rigorous using techniques found in [23] , based on the regularity of the 
operator C^ D . We may rewrite this as 

pK P p(l-p5t 2 f 2 (x, P ) + 0(5t 4 )). 

This means that the equation C*p = becomes 

(C* LD + 5t 2 C* 2 + . . .) ( Pp - 5t 2 f3 P pf 2 + ...)= 0. 

Equating second order terms in St gives 

£ld {pph) = r l C* 2 pp. (9) 

The equation Cl D g = has unique solution g = pp, up to a constant multiple. Hence 
the homogeneous solution to the above PDE is f 2 (x,p) = c, for some constant c; we therefore 
require a particular solution f 2 of Eq. [9j According to the Fredholm Alternative, the equation 
has a solution provided that, for any solution of 

£ld# = 0, 
we have 



gC* 2 pp = 0. 
As the only solutions of Cu^g = are the constants, we require 

C* 2 pp = 0. (10) 



/ 
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3.1 Calculation of the inhomogeneity 

For a symmetric splitting method such as the BAOAB method, recall that we can use the 
Baker-Campbell-Hausdorff [TO] formula to find the overall one-step perturbation operator for 
the scheme. For linear operators X, Y and Z, we have the relation 

St v St 



e ^X e fY e 8tZ e fY e fX = e StS 



where 



5t 2 
S =X + Y + Z + — {[Z, [Z, Y + X]] + [Y, [Y, X}} + [Z, [Y, X}} + [Y, [Z, X}} 

-I [Y, [Y, Z\\ - \ [X, [X, Z\\ - \ [X, [X, Y]}) + 0(5t*). 

Here [X, Y] = XY — YX is the commutator of Y and X. 
In the case of the BAOAB method, we take 

X = C* B = VU{x) ■ V p , Y = C\ = - V ■ V x , 

and 

2 

Z = C Q = 7 V P ■ (p.) + °- A p , 

to compute the perturbed operator for the method. A similar analysis can be conducted for 
the other generalised GLA-type methods considered in this paper. 

To compute C* we simply plug in our choices into the BCH formula to obtain: 

C* = C* LB + 5t 2 C; + 0(5t i ), 

= ^LD + TIT ([Ad> [£o' ^DctJJ + i^A' [^A5 ^bJ] 

+ [C* Q , [CI, C* B ]} + [C* A , [C , C B ]] - | [C\, [CI, C* }} 
-\ [C B , [C B , C }} - \ [C* B , [C B , C\}]) + 0(St% 



where recall that 



and 



Dot _ ^-A i" *-B? 






The calculation of the inhomogeneity in (8) then amounts to a straightforward computation of 
the commutator series applied to pp. The commutators needed are: 

[C* A , [C* A , C* B )] P p =2f3p T U"(x)VU(x) P p - P pf3p- V x p T U'\x)p, 

[C* B , [C* B , C* A }] pp = - 2i3p T U"(x)VU{x)pp, 

[C* Q , [C* A , C B }} pp =2 7 (A x U(x) - (3p T U"(x)p) pp, 

[£\, [C* ,C B \]pp =7/3 {\VU(x)\ 2 - P T U"(x)p) Pp , 

[C\, [C* A , C* \] pp =2 7 {p\ VU{x)\ 2 - A x U{x)) pp, 

[C B ,[C B ,C* o }}pp=0 

[^O) t^O > ^Det]] P/3 =0: 

where we have abbreviated the Hessian V x V^t/(x) =: U"(x). 
Hence we see directly that 

12C* 2 pp = 3 7 (A x U(x) - f3p T U"(x)p) pp + 3/3p T U"(x)VU(x)pp - P p/3p- V x p T U"(x)p, 

giving 



£>* 2 Pf3 = P/3 



| {A x U(x) - f3p T U"(x)p) + ^p T U"(x)VU(x) -L v . V x p T U"{x)p 



• (11) 



Observe that Eq. [TU] is satisfied since the average of the first term is equivalent to a canonical 
average which vanishes 

E (A x U(x) - f3p T U"(x)p) = 0, 

whereas the other terms in Eq. [TT], being canonical averages of terms which are odd-order in p, 
necessarily also average to zero. 

An analogous computation can be performed for the ABOBA method, giving a slightly 
(but crucially) different perturbation operator £*( ABOBA ) ) where 



r * (ABOBA) 

A p/3 = -pp 



- a1 (A x U(x) - f3p T U"(x)p) + -Bp T U"{x)VU{x) - \f3p ■ V x p T U"(x)p 
4 ' 4 o 



Here U" is the Hessian matrix of U and A x = Y2i=i q~~? * s the partial Laplacian in x. Equation 
fTOl is again seen to be satisfied. 

3.2 Expansion in powers of 7 x 

Although we are not able to give the general analytical solution to the partial differential 
equation of Eq. [HI we can find a solution in an important limiting case: the high friction regime. 
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To do this, we expand the invariant density of the numerical method further, viewing both St 
and e = 7 _1 as small parameters: 

p = exp(-/3[H + 5t 2 (fo + foe + 0{e 2 )) + 0(5t 4 )]). (12) 

Dividing by 7, we may reduce Eq. [9] to 

[LI + eCTlifo + e/2,1 + 0(e 2 )) =g + e 9l , 



where 



and, for BAOAB, 



£* = -A p -p-V p , C\ = VU(q) ■ V p - p ■ V a 



g = t i {(3- 1 A x U(x)- P T U"(x)p), 

9i = \p T U"(x)VU(x) - ±p ■ V xP T U"(x)p. 

Note that this is a singularly perturbed system as Co is degenerate and it is only the combined 
operator that has the necessary regularity to define a unique solution. Nonetheless, as explained 
below it is possible to find the leading term / 2 ,o by substituting a truncated expansion of fixed 
degree and solving the resulting equations. 
Equating powers of e, we find 

£0/2,0 — fl'O) £1/2,0 + £0/2,1 = gi, 

and 

Cifo-i = -£0/2," ( for n>\). 
Truncating at n = 2, for example, we find the following solution of these equations: 

ho = / 2 B o A ° AB = I (P T U"(X) P - 0- X AU(x)) , 

/ 2 ,i = / 2 T AB = irWA^x) - j- 2 p T v xP T u"(x)p, 

/ 2 , 2 = / 2 B 2 AOAB = 2k/V,/V,/[/"(*)p - ±VU(x) ■ V x p T U'\x)p. 
For ABOBA, / 2 ,o the solution would change to 

/ 2 A o B ° BA = -| {p T U"(x)p - 2f3~ 1 AU(x)) . 

3.3 Marginal distribution 

We now turn out attention to the configurational marginal distribution obtained by integrating 
the density expansion Eq. [12] with respect to the momenta. Our interest is only in the leading 
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term of this expansion, which defines the sampling behavior for large 7. Ignoring higher order 
terms in St and e, we would have the distribution 

p = px(l + K), 

where, for BAOAB, p = p p x exp (-/3 ^ (p T U"(x)p - P~ l AU{x)) V and K = 0(e5t 2 ) + 
0(St 4 ). 

Noting that the leading term in the exponent of p is quadratic in momenta, we integrate 
out with respect to p to obtain 



sjJV- 



pd p = exp I — f3 



1 St 2 fit 2 



lA ; 



d> 



/ + -^-U" 



St 2 



Using the identity det(M) = exp(trace(log(M))), we find 

/ pd N p ex exp I —-trace I log I / H — —U" J J J x exp I —j3 



St 2 



We then Taylor expand the logarithm of / + ^j-U" and take the trace to obtain a cancellation 
of the St 2 terms, giving 

f pd N p oc exip(-pU + 0(St 4 )). 

The contribution to the configurational distribution error due to p is 0(5t A ). This means that 
the overall error in the marginal distribution of p (which includes the neglected factor 1 + 7V) 
will be 0(e5t 2 ) + 0(5t 4 ). 

If e is small (or 5t is relatively large), the error will be dominated by the quartic term in 
St and we will observe 4th order accuracy in configurational averages. For small St the method 
is always eventually second order. 

In the case of the ABOBA method, the remarkable cancellation of the second order errors 
does not occur and the method always exhibits 2nd order configuration distribution error. 

4 The Limit Method 

We now consider the limit 7 — > 00, where the exact solution of the vector OU process reduces to 
p = \/kBTM l l 2 R, where R is a vector whose components have a standard normal distribution 
(Gaussian white noise). Alternatively, we could consider the limit of the particle mass going 
to 0, although this requires a reformulation of Eq. H] so that the friction is proportional to 
the velocity instead of the momentum [6]. Whichever limit is taken, we would expect the 
ultimate result to be the same. (Here we have reintroduced the masses in order to present the 
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method, since they may be useful scaling parameters in simulation.) In the configurations it is 
straightforward to show that the BAOAB method therefore becomes 

5t i . . 5t i , . 

x n +i = x n + yM 1 F{x n ) + —M l (p n +Pn+i) 

= x n + ^-M" 1 F(x n ) + I ^/k^rM- l ' 2 {R n + R n+1 ), 

where the R n are vectors of i.i.d. standard normal random variables. Replacing 5t 2 /2 by h we 
arrive at Eq. [3j Since the Langevin scheme gives 4th order accurate configurational averages 
in this limit, we expect the method of Eq. [3] to be second order accurate in modified timestep 
h. Moreover, since we completely remove the second order term in the Langevin dynamics 
configurational density expansion, we expect to observe this behaviour across all values of h. 

By contrast the ABOBA scheme gives a much more complicated limit method as 7 — > 00 
which is not in one-step form. 

In the Euler-Maruyama method, the random perturbations introduced at each step are 
independent. In the method of Eq. 2, the random perturbation is a scaling of Z n = 
(R n + R n+ i)/y/2; the components Z n of these are independent of each other and decay linearly 
with timestep: 

<z£U«> = i«i&,ijfii-i> + {Bg>,R£>)) = 1 
(zg>,z®. 1 ) = (R®,R®) = 1/2 

(Z n l \Z n l) _ k ) = 0, fc = 2,3,... 

Thus, in this new method, we use a colored noise which has characteristics that directly depend 
on the stepsize, although the noise decorrelates in just a couple of timesteps. This is therefore no 
longer a Markov process, however it can be reformulated as such if one considers the appropriate 
extended space (eg. y n = [x n ,R n ,R n+1 ]). 

5 Numerical Experiments 

We implemented the methods ABOBA, BAOAB, SPV and BBK and compared the accuracy 
of configurational sampling for different values of 7 and a range of timesteps. A brief analysis 
shows that the use of the harmonic oscillator leads to special cancellations in the BCH series of 
the splitting schemes, making it a poor test subject. Hence, in order to compare the order of 
accuracy of the different schemes, we first considered an oscillator model in ID with potential 
U(x) = x 4 /4 + sin(l + 5x). This was accomplished by introducing M intervals ('bins') of equal 
length, and computing the mean error in the observed probability frequency compared to the 
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Figure 1: The graphs show the comparison of four different Langevin dynamics methods when 
applied using different stepsizes. The configurational distribution errors are plotted against 
the stepsize in a log- log scale. Here ksT = 1. The simulation time was fixed for all runs at 
t = 5 x 10 7 , and five runs were averaged to further reduce sampling errors. At left, 7 = 1, at 
right 7 = 50. The graphs are entirely in keeping with the theory presented in the article. 



exact expected frequency (obtained by integration of the probability density). If the observed 
frequency in bin i is Wj, and the exact expected frequency is a)j, then the error calculated is 

1 M 

Ermr= MEl Wi "^' ( 13 ) 

i=l 

In this one- dimensional example, we used 20 bins to cover the interval from —3.5 to 3.5. The 
configurational density error is plotted against stepsize in log-log scale. If Error oc 5t r then 
we expect this graph to be a line of slope r. Due to the relative simplicity of this model, we 
were able to perform highly resolved simulations to calculate accurate error estimates for the 
configurational distribution. The exact expected value that we compare experimental results 
against can be computed to arbitrary prescision, and we are able to run as many simulations as 
needed in order to drive the variance of results to a minimum. The variance in our results, in 
cases where the stepsize was less than 0.3, was consistently below 10~ 10 . Above a stepsize of 0.3 
some of the methods were found to be unstable. The results of our simulations are summarized 
in Figure [TJ 

As we can see, when 7 is small, the methods perform somewhat similarly, at least in 
the qualitative sense, with all showing a 2nd order error in configurational sampling, and the 
ABOBA and SPV methods essentially identical. As 7 is increased, the substantial difference 
between BAOAB and the other methods becomes apparent. In the limit of large 7 the SPV 
method effectively annihilates the force which results in poor sampling. In the graph for 7 = 1, 
we can see that for larger values of the stepsize, the graph steepens (indicating that the fourth 
order term is dominant); as the stepsize is decreased the method exhibits a 2nd order asymptotic 
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Figure 2: Computed distributions of the ID model problem are compared for 7 = 20. The 
three curves for each of the four methods show the results for three different stepsizes: St — 0.1 
(circles), St = 0.2 (crosses), St = 0.3 (dashed), compared to the dark, solid curve representing 
the exact distribution. The graph shows that the generalized GLA methods are superior to 
SPV and BBK in the moderate 7 regime. 



decay. With 7 = 50 the fourth order behavior is seen for for all indicated data points, although, 
again, this becomes second order for smaller values of St. Note that the limit method Eq. 
2 (with the substitution h = 5t 2 /2) gives an essentially identical behavior to the 7 = 50 case. 
We also give a comparison of the actual computed configurational distributions, at different 
stepsizes using each method, for 7 = 20 in Figure [2J 

To examine the performance of the limit method in more detail, we next considered small 
molecular clusters consisting of seven atoms (motion restricted to the plane), with both Morse 
((Pm) and Lennard- Jones (<£lj) potentials, given by 



^m(^) 

^Lj(r) 



(1 -exp(-a(r-r m ))) 2 
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Figure 3: The diagrams illustrate the distributions of interatomic distances, G(r), for Morse 
(left) and Lennard- Jones (right) clusters. They also show the choice of bins used in calculating 
the numerical distributions. 



where we use a = 2, e = 1 and r r 
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1. The overall potentials are hence 

7 7 

i=l j>i 

7 7 



EE 



<PL3\r. 



1,1 > 



2^ 8 ' 
fc=l 



i=l jf>i 

where r^ is the distance between particles i and j, r^ is the distance between particle k and the 
origin, and a mild harmonic term is included in the case of the Lennard- Jones system in order 
to prevent particles being ejected from the cluster. 

The Morse potential gives a smoother dynamics compared to Lennard- Jones (the Morse 
forces were on average three times smaller than those for Lennard- Jones) and allows a more 
satisfying determination of the error scaling behavior with stepsize. To quantify the error in 
configurational sampling, we calculated the radial density G(r) by binning the instantaneous 
interatomic distances at each step into 20 compartments and compared to a calculated reference 
value (Figure Ej). Though not exact, the errors in the reference will be negligible compared 
with configurational distribution errors at higher stepsizes. For the Morse cluster the reference 
stepsize we used was h ie { = 0.001, whereas for the Lennard- Jones cluster, we used h Te f = 0.00025, 
both with the same integration time that was used in their respective test runs. In both cases 
the cluster was initialized with 6 particles placed on a regular hexagon with unit side-length 
and with the remaining particle in the centre, and initial velocities randomly drawn from the 
canonical distribution. 
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Considerable computation is required to achieve the level of accuracy required, due to the 
dominance of sampling error and the complexity of the system compared to the earlier ID 
example. We ran a large number of independent simulations at each stepsize and computed the 
average radial density plot for both examples, and compared this to the reference result. 

Our results, presented in Figure HI are entirely consistent with our analysis and show the 
second order dependence of the configurational sampling error on h (equivalent to fourth order 
in St), as compared to the Euler-Maruyama's first order behavior. These results demonstrate 
a good agreement with our theoretical results, however even using extensive computation the 
variances in each experiment were still quite high. If uih, n ,m is the density of bin m in simulation 
n at stepsize h, we calculate the variance as 

n m / Ar 

n=l m=l \ 

where we used M = 20 bins for each experiment, N = 200 for the Morse experiment and 
N = 1000 for the Lennard- Jones experiment. The variances for the Morse experiment were 
around 10~ 10 while the Lennard Jones experiment had variances around 10 -8 . We expect 
that completing more simulations (increasing N) would reduce the variance and give smoother 
results in Figure HI Nonetheless, in both cases, we observe a significant reduction in the error 
compared to Euler-Maruyama. 

It might be a suggested that the improved accuracy seen in the high 7 regime could 
potentially come at the price of a slower convergence to equilibrium due to a reduced rate of 
transition between metastable states, hence overall sampling of the configurational distribution 
might be impaired in favour of local sampling. To address this point, we have plotted in 
Figure [5] the error computed in our Lennard- Jones simulation as a function of the number of 
force evaluations (vertical) against the friction value 7 used for the simulation (horizontal). 
Gridpoints in the plot are coloured according to the configurational error, computed using Eq. 
[TBI The results indicate that the convergence rate for the BAOAB method is not diminished 
for large friction coefficient, so it does not appear that we are sacrificing sampling accuracy 
using this scheme. Note that the performance of ABOBA is also robust in the limit of large 7, 
but the achievable accuracy is reduced as it is only second order, consistent with what we have 
presented in this article. 

6 Conclusions 

Our results confirm the theoretical results of Sections 3 and 4, in particular showing the higher 
order configurational sampling of the BAOAB method. The order in its Langevin formulation 
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is effectively four in the large 7 limit and large values of 7 do not impair its stability due to the 
use of exact Ornstein-Uhlenbeck solves. Not only is the order of the method high, the constant 
multiplying the leading term must be, in the cases looked at here, of modest magnitude, since 
the errors are relatively small also at the large stepsize stability threshold. Since, in the context 
of molecular dynamics, BAOAB is a 'cheap' and easy to implement scheme using only a single 
force vector per timestep, we stress that there is no price to pay for its improved accuracy. 

In molecular modelling, there are other errors that play important roles, most importantly 
errors in the force fields (or, more fundamentally, the errors due to not modelling quantum 
mechanics properly) and sampling errors. Obviously these errors may dominate the overall 
method error and limit the relative benefit to be gained by using one integrator as compared to 
another, but it is also clear that both of the other types of errors are constantly being reduced 
through the design of better models and the use of more powerful computers. More important, 
one can ask the question: how can a practitioner know which part of the error in a given 
complicated simulation is due to sampling error and which part due to the truncation errors 
addressed here? In our experiments with molecular models, even where there was substantial 
sampling error still present, we nonetheless found the accuracy to be noticeably higher for the 
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Figure 4: The radial distribution errors are plotted in log-log scale against stepsize, demon- 
strating the first order decay of the error in the case of Euler-Maruyama and the second-order 
behavior of the BAOAB limit method (in modified timestep h). Left: Morse potential; Right: 
Lennard- Jones. For Morse we used a temperature of ksT = 0.1, a fixed time interval of 
t = 4 x 10 6 , with stepsizes ranging from 0.0075 to 0.0225. For Lennard- Jones the tempera- 
ture was ksT = 0.2, t = 2.5 x 10 5 and stepsizes ranged from .001 to 0.0022. In order to drive 
the variance of the results down, a large number of runs were necessary: for the Morse simula- 
tion the error is computed using the average histogram computed from 200 independent runs, 
while the Lennard- Jones simulation used 1000 independent runs. Around two orders of magni- 
tude of improvement are observed in the accurate regime, but, perhaps even more important, 
the BAOAB limit method is usable at substantially larger stepsizes than Euler-Maruyama. 
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Figure 5: The diagrams illustrate the performance of the different algorithms (labelled) by 
showing the error in the computed radial distribution functions as a function of both 7 and 
the number of timesteps (samples) taken, in the case of the 7 atom Lennard- Jones model. 
The graphs address the potential concern that the larger values of 7 needed to give the 
superconvergence property may reduce the rate of convergence to equilibrium (it does not, 
in the case of BAOAB). The same stepsize of St = 0.044 was used for all these simulations. 

BAOAB method; it is likely that this improvement in sampling accuracy would be of direct 
benefit in many real world simulations. Finally we point out that the BAOAB scheme and its 
limit method (Eq. 2) was, in each case studied, stable at a larger stepsize than the alternatives, 
meaning that longer time intervals are made accessible. This was particularly dramatic in the 
case of the Lennard- Jones system. 
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Appendix: Langevin Dynamics Integrators 

The Langevin dynamics methods used for the numerical experiments in this paper are given 
here. Ri is an N— vector of i.i.d. Normal random numbers, with mean and variance 1. 
The diagonal mass matrix is denoted M, and we assume a timestep St is provided. Given the 
parameter 7, we define useful constants 



ci = e"^, c 2 = 7 - 1 (l-c 1 ; 



and 



c 3 = Jk B T(l-cj] 



'BAOAB' method 

Pn+l/2 =p n - 5tVU{x n )/2\ 
x n +\/2 = x n + StM~ 1 p n+1/2 /2; 

Pn+l/2 = ClPn+1/2 + C 3 M 1/2 R n+1 ; 

x n+1 = x n+1/2 + 5tM- 1 p n+1/2 /2; 
p n+1 = Pn+1/2 - StVU(x n+ i)/2; 

'ABOBA' method 

^n+1/2 = x n + 5tM~ 1 p n /2; 

Pn+l/2 =Pn~ StVU(x n+1/2 )/2; 
Pn+l/2 = ClP n+ l/2 + C 3 M l/2 R n+1 ] 

p n+1 = pn+1/2 - StVU(x n+ i/ 2 )/2; 
x n+1 = x n+1/2 + StM~ 1 p n+1 /2; 

Stochastic Position Verlet (SPV) 

x n+1/2 = x n + 5t M~ 1 p n /2; 
p n+1 = dPn - c 2 VU{x n+1 / 2 ) + c 3 M 1/2 R 
x n+ i = x n+1/2 + 5tM~ 1 p n+1 /2; 



■n+l; 
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The Method of Brunger-Brooks-Karplus (1982) (BBK) 



Pn+i/2 = (1 - 6t-y/2)p n - StVU(x n )/2 + ^5tk B T 1 M 1 / 2 R n /2; 
x n+1 = x n + <ft M~Vh-i/2; 
Pn+i = [Pn+i/2 - 5tVU(x n+1 )/2 + ^/5tk B T 1 M 1 / 2 R n+1 /2]/(l + Stj/2); 

Euler-Maruyama 

x n+1 = x n - 8tM- x VU{x n ) + v / 2k B T6tM- 1/2 R n ; 
'BAOAB' - Limit Method 



x n+1 = x n - 5tM~ 1 VU(x n ) + J^—M- l ' 2 (R n + R n+1 ); 
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